ABSTRACT

Current usage and theory of standard error of
measurement calls for one standard error of measurement figure to be
used across all levels of scoring. The study revealed that scoring
variance across scoring levels is not constant. As scoring ability
increases, scoring variance decreases. The assertion that low and high
scoring subjects will correctly guess the same number of unknown
questions is incorrect. Low and high scoring subjects correctly guess
the same percentage of unknown questions. Clearly, low scoring
subjects are guessing at a greater number of unknown questions and,
therefore, will have a larger amount of scoring variance. Post hoc
analysis of response option frequencies and changes of option choice
was undertaken. Low scoring subjects apparently employ random
guessing when answering questions where they do not know the correct
answer. High scoring subjects seem to employ educated guessing when
answering questions where they do not know the correct answer. Item
difficulty appeared to be unrelated to the change from random
guessing to educated guessing; however, the large number of omissions
by low scoring subjects made the item difficulty effect difficult to
interpret. (Author)

A Reassessment of Standard Error of Measurement

Alan C. Klaas
Wayne State University

Background 

The theory of standard error of measurement is part of the
content of many textbooks and courses dealing with tests and
measurement (Ebel, 1972; Helmstadter, 1964). Standard error theory
is used by countless test publishers and test interpreters
in relating to people the results of standardized tests
(Henmon-Nelson, 1957; Metropolitan, 1959; Otis-Lennon, 1967).
A standard error figure is often used to construct score bands
or confidence intervals into which the subject's score is said
to fall. Such score bands are used to present to the subject
as accurate a picture as possible of what his score means and
does not mean.

The use of standard error of measurement by authors of 
textbooks, instructors in tests and measurement courses, and 
test score interpreters is explored in the present paper. 
Specifically examined is the common practice of applying a single 
standard error figure, the standard error at the mean, for all 
subjects regardless of where those subjects scored. Very low 
scoring subjects, middle ability subjects, and very high scoring 
subjects will have scores explained using the same sized scoring 
band or confidence interval. 

Standard error of measurement explains the differences in
scoring from one test administration to the next administration
to be the result of random guessing at unknown answers. A subject's
obtained score is said to be made up of two parts: true ability,
which is the number of questions for which the subject knows the
correct answers; and guessing error, which is the number of
questions for which the subject does not know the correct answers
but was able to guess correctly. A subject's true score is said
to be made up of two parts: true ability, as defined above; and
perfect random guessing, which is correctly guessing the number
of unknown answers equal to the probability of correct guessing
randomly across all options for the item. Standard error of
measurement is based upon the assumption of random guessing across
all options when attempting an item for which the subject does
not know the correct answer (Nunnally, 1967; McNemar, 1969).

Examples of Standard Error Applications 

When an individual takes a test, he attains a certain score
which is based upon the number of questions answered correctly.
If that same person repeats that same test, he will usually get
a different score. If the process is repeated a third time, a
score different from the first two testing sessions will likely
result. Different scores by the same person on the same test
create a problem for those people who would like to make some
decision on the basis of a test score. A single test
administration does not yield a perfectly reliable measure.

To account for inconsistent measurement of stable traits,
the theory of standard error of measurement (SEM) has been
developed. Standard error of measurement is used to explain
why a person will score differently on successive attempts at
taking the same test. There are two explanations of the SEM:
the commonly used explanation and the technically accurate
explanation.

The common explanation of SEM begins with a single attained
score. Subject A has a raw score of 60. Perhaps the SEM figure
for that test is 3 raw score points. The interpreter might explain
to subject A that there is a 95% chance that the subject's true
ability test score is between 54 and 66 raw score points, that is,
within two SEM of the attained score. That is to say, the true
score is somewhere in an interval about the attained score.
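
The score band in the common explanation can be sketched in a few lines. This is only an illustration: the function name `score_band` is hypothetical, and the multiplier of 2 is an assumed rounding of the usual 1.96 for a 95% interval, chosen because it gives a band of 54 to 66 for a score of 60 and an SEM of 3.

```python
def score_band(obtained, sem, z=2.0):
    """Return the (low, high) band of obtained score +/- z * SEM.

    z=2.0 is an assumed rounding of 1.96 (95% interval).
    """
    half_width = z * sem
    return obtained - half_width, obtained + half_width


low, high = score_band(60, 3)
print(low, high)  # 54.0 66.0
```

The same band width is applied no matter where the obtained score falls, which is the practice the paper goes on to question.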

The technically correct explanation of SEM is just the
opposite. Technically speaking, the attained scores of many
administrations form an interval about the true score. Suppose subject
B took the test 100 times. He would obtain 100 scores which
would be formed into a frequency distribution with subject B's
true score said to be the mean of that distribution. Scoring
above the mean is explained as high random success of guessing
at unknown items, while scores below the mean would be explained
as low random success of guessing at unknown items.

In common practice only one SEM figure is calculated for
any test. Textbooks discussing SEM seem to imply, by omission,
that only one SEM figure is necessary to explain inconsistent
measurement. Apparently the SEM is felt to represent an equal
size interval for all levels of test scoring ability. Apparently
random guessing at all item response options is also felt to be
equal for all levels of scoring.

The explanation of true score commonly presented is as
follows. Suppose a test contains 100 items with each of the items
being a four option multiple choice question. If subject C has
a true ability of 60 items, then he will get at least 60 items
correct. The probability of correctly guessing an unknown item
in a four option question is .25. Therefore, subject C's true
score would be said to equal the true ability of 60 items correct
plus 40 unknown items times .25 probability of correctly guessing
unknown items (60 + (40 x .25)), or a true score of 70 raw score
points. Variations in attained scores are said to result from
changes in the success of guessing at all options for unknown
questions. Apparently, again by omission, the theory holds that
random guessing at all options on unknown questions will exist
regardless of scoring ability.
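
The arithmetic for subject C can be written out as a short sketch. The function name `expected_true_score` is hypothetical; it simply encodes the rule described above, true ability plus the unknown items times the chance of a correct random guess.

```python
def expected_true_score(known_items, total_items, n_options):
    """True ability plus the expected number of correct random guesses."""
    unknown_items = total_items - known_items
    return known_items + unknown_items / n_options


# Subject C: 100 four-option items, true ability of 60 items.
print(expected_true_score(60, 100, 4))  # 70.0
```
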
Methodology

A school ability test (Henmon-Nelson Test of Mental Ability
Form B for grades six through nine) was administered twice to 608
eighth graders from 19 elementary schools in St. Louis, Missouri
and Milwaukee, Wisconsin. The interval between administrations
was three weeks. The subjects were from a wide range of
socioeconomic backgrounds, including a wide range of scholastic
ability, and had experienced previous contact with standardized
tests. The a priori data analysis procedure consisted of three
steps. An additional post hoc procedure was added after initial
data analysis.

First, the subjects were divided into groups based upon their
scoring performance on the first administration. Subjects scoring
at about three standard deviations below the mean (-3 SD) form
one group, labeled -3.00. Subjects scoring at about two standard
deviations below the mean (-2 SD) form the second group, labeled
-2.00. The grouping process continued up the score scale to
the final group, which scored at about two standard deviations
above the mean (+2 SD) and was labeled +2.00. Thus the groups
were formed based upon their proximity to each of the six whole
number standard deviations between -3 SD and +2 SD. (The data
had a skewness index of -7.17, indicating a negative skew.
Examination of several papers discussing the effect of the
attained skewness figure leads to the conclusion that the statistic
would not be adversely affected by the observed amount of lack of
normality. However, the presence of the skew would be important
to avoid in future studies of this type.)
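
The grouping step can be sketched as follows. The paper does not state how close to a whole-number standard deviation a score had to be, so the `half_width` of 0.25 SD used here is purely an assumption for illustration, as is the function name `assign_subgroup`.

```python
def assign_subgroup(score, mean, sd, half_width=0.25):
    """Return the whole-number SD subgroup (-3..+2) a score falls near.

    half_width is a hypothetical tolerance in SD units; scores that are
    not close to any whole-number SD are left ungrouped (None).
    """
    z = (score - mean) / sd
    nearest = round(z)
    if -3 <= nearest <= 2 and abs(z - nearest) <= half_width:
        return nearest
    return None


# Illustrative values only: a test with mean 50 and SD 10.
print(assign_subgroup(50, 50, 10))    # 0
print(assign_subgroup(30.5, 50, 10))  # -2
print(assign_subgroup(45, 50, 10))    # None
```

A tolerance of this kind would leave some subjects ungrouped, which is consistent with the subgroup sizes reported in the Results summing to fewer than 608.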

Second, a well known result of repeated administrations
of the same test is the phenomenon called practice effect. The
average (mean or median) score of a group from a second
administration of a test will always be higher than the average score
from the first administration. By subtracting each subject's first
administration score from the second administration score, a
change score is obtained. The change scores formed the data for
the study.
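
A minimal sketch of the change score step is given below. It takes the change score as second administration minus first, which matches the description given with Table 1 in the Results. The scores are made up for illustration and are not data from the study.

```python
from statistics import mean, variance


def change_scores(first, second):
    """Second-administration score minus first, per subject."""
    return [s - f for f, s in zip(first, second)]


# Made-up scores for four subjects in one subgroup (illustration only).
first = [40, 42, 45, 50]
second = [44, 45, 47, 55]
changes = change_scores(first, second)
print(changes)            # [4, 3, 2, 5]
print(mean(changes))      # 3.5
print(variance(changes))  # sample variance (n - 1 in the denominator)
```

The mean and variance of the change scores within each subgroup are the quantities compared in the statistical analysis.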

Third, the statistical analysis contained two parts. The
variance of the change scores for each of the subgroups was
compared, using Bartlett's Test of Homoscedasticity to test if
the variation of scoring were equal across all levels of scoring.
The second part of the statistical analysis was a simple effects
two-tailed t-test for the equality of mean change scores across
all levels of scoring (Glass and Stanley, 1970).

The post hoc analysis consisted of two option response
frequency summaries. The first summary was a bivariate option
frequency response table, listing first testing option selected
across second testing option selected, further grouped by subgroup
for each item. The second summary was a simple option response
frequency and proportion table, listing first testing, second
testing, and total for both tests, also grouped by subgroup for
each item.

Results 

Table 1 presents the variance for each subgroup for the 
change scores calculated by subtracting the first administration 
score from the second administration score. Not only were the 
subgroup variances found to be unequal, but the variances resulted 
in a definite pattern. The higher subjects scored, the lower was 
their variation of scoring from first to second administration. 



TABLE 1

Results of Bartlett's Test of Homoscedasticity for Change
Scores from Test Session One to Test Session Two

                                 Subgroup
Statistic          -3      -2      -1       0      +1      +2

Variance        65.59   65.84   46.01   32.15   14.46    7.39
Number             10      34     127     204     173      23

Chi-square = 79.95*
Degrees of Freedom = 5

*Significant at .001
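
The chi-square in Table 1 can be checked from the tabled variances and group sizes alone. The sketch below uses the standard textbook form of Bartlett's statistic with its correction factor; the paper does not print the formula it used, so treating it as this form is an assumption, and the function name `bartlett_from_summary` is hypothetical.

```python
import math


def bartlett_from_summary(variances, sizes):
    """Bartlett's chi-square and degrees of freedom from group summaries."""
    k = len(variances)
    dfs = [n - 1 for n in sizes]
    df_total = sum(dfs)
    # Pooled variance, weighting each group by its degrees of freedom.
    pooled = sum(d * v for d, v in zip(dfs, variances)) / df_total
    numerator = df_total * math.log(pooled) - sum(
        d * math.log(v) for d, v in zip(dfs, variances)
    )
    correction = 1 + (sum(1 / d for d in dfs) - 1 / df_total) / (3 * (k - 1))
    return numerator / correction, k - 1


# Values taken from Table 1 (subgroups -3 through +2).
variances = [65.59, 65.84, 46.01, 32.15, 14.46, 7.39]
sizes = [10, 34, 127, 204, 173, 23]
stat, df = bartlett_from_summary(variances, sizes)
print(round(stat, 2), df)
```

Under these assumptions the statistic should come out close to the reported 79.95 on 5 degrees of freedom, far beyond the .001 critical value of about 20.5.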



Table 2 presents the t-values and the degrees of freedom for
all possible t-test comparisons of the mean change score for each
subgroup. Seven statistically significant differences and eight
non-significant differences were found. The seven significant
differences are not randomly located but follow a definite pattern.
All statistical analyses involving subgroups above the mean (+1.00
and +2.00) were significant, with the exception of analyses
involving subgroup -3.00, which contained a small n size.
Examination of the mean change scores suggests a trend for the
mean change score to decrease as scoring ability increases.
TABLE 2

Mean Change Scores and Related t-Values for Test of
Change Scores for Test Session One to Test Session Two

Subgroup              -3       -2       -1        0       +1       +2

Mean                3.60     4.26     3.88     3.48     1.82     0.13

-2.00     t =      -0.23
          (d.f.)    (42)

-1.00     t =      -0.12     0.28
          (d.f.)   (135)    (159)

 0.00     t =       0.06     0.70     0.58
          (d.f.)   (212)    (236)    (329)

+1.00     t =       1.33     2.74*    3.35*    3.29*
          (d.f.)   (181)    (205)    (298)    (375)

+2.00     t =       1.86     2.35*    2.61*    2.79*    2.05*
          (d.f.)    (31)     (55)    (148)    (225)    (194)

*Significant in two-tailed test at .10
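
The entries in Table 2 can be approximated from the subgroup means in Table 2 and the variances and group sizes in Table 1. The sketch below uses an ordinary pooled-variance two-sample t statistic; the paper describes its procedure only as a simple effects t-test, so the pooled form and the function name `pooled_t` are assumptions.

```python
import math


def pooled_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Subgroup -3 (mean 3.60, variance 65.59, n 10) against
# subgroup -2 (mean 4.26, variance 65.84, n 34).
t, df = pooled_t(3.60, 65.59, 10, 4.26, 65.84, 34)
print(round(t, 2), df)  # -0.23 42
```

The degrees of freedom printed in Table 2 agree with n_a + n_b - 2 for every pair of subgroups, which supports reading the comparisons as two-group tests of this kind.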



Three of the 90 items were chosen from the post hoc data
analysis as examples of the option response selections made by
subjects grouped into subgroups of like scoring ability.
Tables 3 and 4 present
the option responding for question 15. The difficulty index
for item 15 was 0.94, an easy item for the entire group of
subjects. Option 3 is the correct response. Examination of
the proportion of times each option was selected (Table 4)
reveals that for the lowest group, -3.00, incorrect responses
were randomly chosen. Subgroups -2.00 and -1.00 seemed to
prefer options 1 and 2 in guessing. Subgroups +1.00 and +2.00
rarely missed the item. Even for an easy item the low scoring
group was responding randomly. However, middle scoring groups
seemed to have an option preference. That is to say, not all
of the options were equally attractive to middle scoring
subjects. Small n sizes of subjects missing the item, however,
dictate some caution in option preference interpretation for
question 15.
TABLE 3

Subgroup Test Session One and Test Session Two Option Response
Frequency for Question 15 "3, 6, 9, 12, 18, What
two numbers should come next? (1) 19 and 20 (2) 21 and 22
(3) 21 and 24 (4) 15 and 12 (5) 17 and 16"

Subgroup   Second Test         First Test Responses
(n)        Response        1     2    *3     4     5  Omit

-3.00      1               1     0     0     0     0     0
(10)       2               0     1     1     0     0     0
           *3              0     1     2     1     0     0
           4               0     1     0     0     0     0
           5               0     0     1     0     1     0
           Omit            0     0     0     0     0     0

-2.00      1               3     0     1     0     0     0
(34)       2               0     1     1     0     0     0
           *3              0     1    23     0     0     0
           4               0     1     2     0     0     0
           5               0     0     1     0     0     0
           Omit            0     0     0     0     0     0

-1.00      1               0     0     3     0     0     0
(127)      2               1     0     3     0     1     0
           *3              1     5   109     0     1     0
           4               0     1     2     0     0     0
           5               0     0     0     0     0     0
           Omit            0     0     0     0     0     0

 0.00      1               1     1     0     0     0     0
(204)      2               0     1     1     0     0     0
           *3              1     6   187     2     0     0
           4               0     0     3     1     0     0
           5               0     0     0     0     0     0
           Omit            0     0     0     0     0     0

+1.00      1               0     0     0     0     0     0
(173)      2               0     0     0     0     0     0
           *3              0     1   170     0     0     0
           4               0     0     1     0     0     0
           5               0     0     0     0     0     0
           Omit            0     0     1     0     0     0

+2.00      1               0     0     0     0     0     0
(23)       2               0     0     0     0     0     0
           *3              0     0    22     0     0     0
           4               0     0     1     0     0     0
           5               0     0     0     0     0     0
           Omit            0     0     0     0     0     0

*Correct response
TABLE 4

Option Response Summary for Question 15

Subgroup  Test      Frequency/          Response Options
(n)       Session   Proportion     1     2    *3     4     5  Omit

-3.00     First     f              1     3     4     1     1     0
(10)                p            .10   .30   .40   .10   .10   .00
          Second    f              1     2     4     1     2     0
                    p            .10   .20   .40   .10   .20   .00
          Total     f              2     5     8     2     3     0
                    p            .10   .25   .40   .10   .15   .00

-2.00     First     f              3     3    28     0     0     0
(34)                p            .09   .09   .82   .00   .00   .00
          Second    f              4     2    24     3     1     0
                    p            .12   .06   .71   .09   .03   .00
          Total     f              7     5    52     3     1     0
                    p            .10   .07   .76   .04   .01   .00

-1.00     First     f              2     6   117     0     2     0
(127)               p            .02   .04   .92   .00   .02   .00
          Second    f              3     5   116     3     0     0
                    p            .02   .04   .91   .02   .00   .00
          Total     f              5    11   233     3     2     0
                    p            .02   .04   .92   .01   .01   .00

 0.00     First     f              2     8   191     3     0     0
(204)               p            .01   .04   .94   .01   .00   .00
          Second    f              2     2   196     4     0     0
                    p            .01   .01   .96   .02   .00   .00
          Total     f              4    10   387     7     0     0
                    p            .01   .02   .95   .02   .00   .00

+1.00     First     f              0     1   172     0     0     0
(173)               p            .00   .01   .99   .00   .00   .00
          Second    f              0     0   171     1     0     1
                    p            .00   .00   .98   .01   .00   .01
          Total     f              0     1   343     1     0     1
                    p            .00   .00   .99   .00   .00   .00

+2.00     First     f              0     0    23     0     0     0
(23)                p            .00   .00  1.00   .00   .00   .00
          Second    f              0     0    22     1     0     0
                    p            .00   .00   .96   .04   .00   .00
          Total     f              0     0    45     1     0     0
                    p            .00   .00   .98   .02   .00   .00

*Correct response
Tables 5 and 6 present the option responding for question 60.
The difficulty index for item 60 was 0.70, with option 3 being
correct. Subgroup -3.00 subjects who answered the question
incorrectly seemed to prefer option 5. However, the proportion
of people in the -3.00 subgroup who omitted the item is very large,
causing interpretation to be quite difficult. Subgroups -2.00 and
-1.00 seemed to prefer incorrect options 1, 2, and 4. Subgroup
0.00 had an equal proportion choosing options 4 and 5, but had
larger proportions preferring options 1 and 2. Subgroups above
the mean, +1.00 and +2.00, seemed to have a strong tendency to choose
options 1 and 2. The summary of item 60 is that subjects scoring
below the mean who were incorrect in responding seemed to be
responding randomly. Subjects scoring above the mean who incorrectly
marked the options seemed to have a strong tendency to select
options 1 and 2.

Tables 7 and 8 present the option responding for question
85. The difficulty index for item 85 was 0.11, indicating a very
hard item. Option 5 was the correct response. Interpretation of
option selection tendencies for groups scoring below the mean is
hampered because of the large number of omissions. Option selection
for subgroups below the mean was felt to be random with a slight
preference for option 2. The preference for option 2 was stronger
in the 0.00 group, with responses for the remaining options appearing
to be random. The subgroups above the mean had a marked tendency
to select options 2 and 5. The summary of item 85 is that low
scoring subjects seemed to be selecting options randomly, and
that high scoring subjects seemed to prefer two particular options.

Conclusions 

Two conclusions have been drawn from the results. The first
conclusion is that scoring error is not consistent across all
levels of scoring. To calculate and use one SEM figure for all
levels of scoring seems to be an incorrect procedure. The noted
negative skew in the data calls for caution before contending that
the observed inverse relationship between score level and change
score variance will necessarily always hold true. The study needs
to be replicated several times in a test-retest setting. An
interesting theoretical observation has suggested itself from the
observed findings.

Standard error of measurement is usually explained in terms of
a subject's random fluctuations in ability to correctly guess the
answers to questions for which he has no knowledge. Some days the
subject has a good day guessing and some days he has a bad day
guessing. The current usage of standard error of measurement
maintains that the number of unknown questions which a low scoring
subject can correctly guess is equal to the number of unknown
questions which a high scoring subject can correctly guess. The
size of the score band or confidence interval in raw score points
is uniform.
TABLE 5

Subgroup Test Session One and Test Session Two Option Response
Frequency for Question 60 "The daughter of my uncle has a son.
My father is her son's .... (1) cousin (2) grandfather
(3) great-uncle (4) great grandfather (5) brother"

Subgroup   Second Test         First Test Responses
(n)        Response        1     2    *3     4     5  Omit

-3.00      1               0     0     0     0     0     0
(10)       2               0     0     0     0     0     0
           *3              1     0     1     0     0     0
           4               0     0     0     0     0     0
           5               1     1     1     1     0     2
           Omit            0     0     0     0     0     2

-2.00      1               1     2     2     0     0     0
(34)       2               0     2     1     0     0     2
           *3              3     2     6     2     0     0
           4               0     3     3     0     0     0
           5               0     1     0     1     0     0
           Omit            0     0     0     0     0     3

-1.00      1               4     1     3     0     4     0
(127)      2               4     4     1     3     1     6
           *3              6    12    44     5     0     2
           4               5     2     8     3     0     0
           5               0     ?     ?     ?     0     0
           Omit            0     0     2     1     0     3

 0.00      1               4     1    10     1     0     0
(204)      2               1     3     5     2     1     0
           *3             10    11   125     4     3     0
           4               1     3     4     1     1     0
           5               3     0     4     3     3     0
           Omit            0     0     0     0     0     0

+1.00      1               4     0     7     0     0     0
(173)      2               1     0     3     0     0     0
           *3              4     6   139     1     1     1
           4               0     1     2     0     0     0
           5               0     0     3     0     0     0
           Omit            0     0     0     0     0     0

+2.00      1               0     0     1     0     0     0
(23)       2               0     1     1     0     0     0
           *3              0     0    19     0     0     0
           4               0     0     0     0     0     0
           5               0     0     0     0     0     0
           Omit            0     0     1     0     0     0

*Correct response
? Entry not legible in the source copy
TABLE 6

Option Response Summary for Question 60
TABLE 7

Subgroup Test Session One and Test Session Two Option Response
Frequency for Question 85 "Circle is to ellipse as square is to:
Such a usage seems theoretically incorrect. It is not the
number of correctly guessed unknown questions which is the same,
but the percentage of correctly guessed unknown questions which is
the same. Both low and high scorers have an equal probability of
correctly guessing at unknown questions; however, the low scoring
subject is guessing at many more unknown questions than a high
scoring subject. Consequently, the low scoring subject would have
greater chance for a wider variability in error of measurement than
would the high scoring subject.
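
The argument above can be illustrated with a simple binomial sketch. Treating each guess as an independent trial with the same success probability is an assumption made here for illustration (the paper does not state a formal model), and the item counts are hypothetical rather than taken from the study.

```python
def guessing_variance(unknown_items, p_correct):
    """Variance of the number of correct guesses under a binomial model."""
    return unknown_items * p_correct * (1 - p_correct)


# Hypothetical subjects on four-option items (p = .25 for both).
low_scorer = guessing_variance(80, 0.25)   # guesses at 80 unknown items
high_scorer = guessing_variance(10, 0.25)  # guesses at 10 unknown items
print(low_scorer, high_scorer)  # 15.0 1.875
```

With the same probability of a correct guess per item, the subject who guesses at more items has the larger score variance, which is the direction of the variances observed in Table 1.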

The second conclusion further complicates any discussion of
SEM intervals. The post hoc analysis of option selection was
complicated by small sample sizes in the lowest subgroup (-3.00)
and by large response omissions through the middle range of scoring.
The tendency of high scoring subjects to choose between two options
was very pronounced. The results suggested the following statement:
as scoring ability increases, the process of choosing options when
the item is not known appears to change from the random guessing
theory (SEM) to an educated guessing theory. Figure 1 illustrates
that while low scoring subjects seem to be responding randomly, high
scoring subjects seem to choose most likely responses. Perhaps
high scoring subjects are able to eliminate some options as being
obviously wrong and then randomly choose from the remaining most
likely options. Eliminating some options implies that high scorers
seem to have some knowledge of the question, but not enough to
answer correctly. Possessing partial knowledge and then guessing
among the remaining choices is defined as educated guessing.
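
The difference between random and educated guessing can be sketched as a change in the chance of a correct guess. The sketch assumes that the correct option is never among those eliminated, which is an idealization, and the function name `guess_probability` is hypothetical.

```python
def guess_probability(n_options, n_eliminated=0):
    """Chance of a correct guess among the options not eliminated.

    Assumes the correct option is never one of those eliminated.
    """
    remaining = n_options - n_eliminated
    if remaining < 1:
        raise ValueError("at least one option must remain")
    return 1 / remaining


# Five-option item, as in the example questions above.
print(guess_probability(5))     # 0.2  (random guessing at all options)
print(guess_probability(5, 3))  # 0.5  (choosing between two options)
```

Under this idealization a subject who narrows a five-option item to two options no longer guesses with the probability that the random guessing theory assigns to the item.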



Figure 1. Illustration of Relationship Between Test Scoring Ability
and Theoretical Interpretation of Guessing at Unknown
Test Items.
The above conclusions need much further work with test-retest
data. They are offered in hopes of stimulating much further study
of an often taken-for-granted theory: standard error of measurement.